Most of the results in this Jupyter notebook are animations or interactive plots, so the notebook takes time to run completely. For that reason, all animations and graphs are also provided as attachments, and the notebook is also saved in HTML format. The number of iterations and the animation speeds have been adjusted to keep the run time short; if you are interested in the full results, use the videos in the attachment.
Glider data are an important source of observations for oceanographers. The Dalhousie glider was prepared by the Coastal Environmental Observation Technology and Research (CEOTR) group (ceotr.ocean.dal.ca). Support for the deployment and operation of this mission was provided by the Nunatsiavut Government, Oceans North, and the Ocean Tracking Network. Data were collected in a region governed by the Nunatsiavut Government. Please credit the Nunatsiavut Government when using these data. Please contact the Nunatsiavut Government Research Advisory Committee (NGRAC) at research@nunatsiavut.com for information about using data within the Labrador Inuit Settlement Area. The data visualized here were collected by a Slocum glider that traveled 948.8 km through the North Atlantic Ocean. The deployment started on 2019 Sep 2 at 4:20 am NST and ended on 2019 Oct 28 at 4:03 pm NST. The mission 105 data set is used for this project. It is a CSV file; the download URL is given in the data-loading cell for mission #105.
To clean the data set, all columns are first renamed to meaningful names by stripping the units from the column names. According to the information given by the customer, either depth or pressure is recorded, but never both at the same time. The data are therefore split into two data frames: one with the depths and one with the pressure.
Pressure and depth are very closely related: 1 m of depth corresponds to approximately 1 dbar of pressure.
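The 1 m to 1 dbar rule of thumb can be used to merge the two columns into a single vertical coordinate. The following is only a minimal sketch on made-up values: the column name `vertical` and the toy numbers are assumptions for illustration and are not part of the notebook's data.

```python
import numpy as np
import pandas as pd

# Toy rows: each row has either a depth (m) or a pressure (dbar), never both.
toy = pd.DataFrame({
    "depth": [5.0, np.nan, 20.0, np.nan],
    "pressure": [np.nan, 10.0, np.nan, 30.0],
})

# Approximation: treat 1 dbar as 1 m, so pressure fills the gaps in depth.
toy["vertical"] = toy["depth"].fillna(toy["pressure"])
print(toy["vertical"].tolist())
```

The result is an approximation only; it is adequate for plotting but not for precise depth calculations.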
In this work, deleting a single NaN value would discard all the other data in that row of the data set. That is why we avoided dropping rows across the whole table: we want to keep the full path that the glider followed.
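A small sketch of why a blanket `dropna()` is avoided. The toy table below is made up for illustration; it only mimics the property that depth and pressure are never recorded in the same row.

```python
import numpy as np
import pandas as pd

# Made-up rows: depth and pressure never appear together.
toy = pd.DataFrame({
    "depth": [5.0, np.nan, 20.0],
    "pressure": [np.nan, 10.0, np.nan],
    "temperature": [np.nan, 4.2, np.nan],
    "latitude": [57.1, 57.2, 57.3],
})

# Dropping every row that contains any NaN removes the whole track.
rows_left_all = len(toy.dropna())

# Dropping NaN only within the columns a plot needs keeps the path.
rows_left_position = len(toy[["depth", "latitude"]].dropna())

print(rows_left_all, rows_left_position)
```

Selecting the needed columns first and only then dropping missing values is the same idea the notebook uses when it builds the `position` and `profile` data frames.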
The data are filtered frequently in this notebook (only a limited range of days and times is considered).
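Filtering by date can be sketched with a boolean mask on a datetime index, which is the same pattern the notebook applies to `profile` and `position`. The daily toy series and the names `toy`, `start`, and `end` are assumptions for illustration.

```python
import pandas as pd

# Toy daily series standing in for the glider data (values are made up).
index = pd.date_range("2019-09-05", "2019-09-16", freq="D")
toy = pd.DataFrame({"temperature": range(len(index))}, index=index)

# Keep only the rows strictly between two dates (boolean mask on the index).
start, end = "2019-09-07", "2019-09-14"
subset = toy[(toy.index > start) & (toy.index < end)]
print(subset.index.min(), subset.index.max())
```

Because both comparisons are strict, the rows stamped exactly at midnight of the start and end dates are excluded.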
The notebook provides a visual aid to assist in identifying the chemical and physical properties of the ocean. The main points are:
• To represent ocean data in graphs and in more scientific ways.
• Graphs of oceanographic data are easier to interpret than raw or tabular data.
• To show how the variables change from the surface to depth, at different latitudes and longitudes, and at different times of day, month, or even year:
• Temperature: how temperature changes with depth.
• Salinity: how salinity changes with depth.
The given data set is in tabular format. Each cell contains a measurement of a continuous phenomenon such as pressure, depth, temperature, salinity, or density.
The columns of the data set and their definitions are given in the table below:
| Name of attribute / Type | Description |
|---|---|
| time (UTC) / Ordered | Exact date and time at which each data point was collected |
| depth (m) / Quantitative | Diving depth underwater, in meters |
| latitude (degrees_north) / Spatial position / Quantitative | Latitude of the measurement position |
| longitude (degrees_east) / Spatial position / Quantitative | Longitude of the measurement position |
| conductivity (S.m-1) / Quantitative | Conductivity of the water at a specific position |
| temperature (Celsius) / Quantitative | Temperature of the water at a specific position |
| salinity (psu) / Quantitative | Salinity of the water at a specific position |
| density (kg.m-3) / Quantitative | Density of the water at a specific position |
| pressure (dbar) / Quantitative | Pressure at a specific position |
| profile_id / Categorical | Identifier of the glider profile |
Why analysis: To answer the above-mentioned questions from the point of view of Munzner's methodology, some actions must be taken to reach our goal. We define our actions as analyzing the given information in order to present it in a user-friendly form, and from this data we derive the information the user needs. So our action list is: Present, Summarize. All these activities aim at specific targets, such as presenting Temperature, Salinity, Conductivity, and Density at a particular latitude and longitude. In the 3D plots, latitude, longitude, and depth at a particular geographical location are considered. So distribution and dependency are our targets in this visualization.
In my plots, I visualized the interaction between nearly all the variables that play an important role in analyzing the data. These variables are Conductivity, Temperature, Salinity, and Density, together with their positions at a particular depth or pressure.
The How part is discussed separately for each graph.
The client is Dr. James Munroe from the department of Physical Oceanography.
Contact Information
Office: C4060
Lab: C1051
Phone: (709) 864-7362
Email: jmunroe @ mun . ca
I had little opportunity for direct client contact, because the client was not physically on the St. John's campus. To resolve this issue, the customer suggested holding video conferences over Zoom. Most of our conversations and contact took place over Zoom and email. I had the chance to talk with him face to face only once.
Most of our conversations and meetings took place in the PM hours because the customer was busy during the day.
Before our first contact, he sent me some useful links to help me understand the nature of the data. These links are listed below:
https://portal.secoora.org/#search?type_group=all&tag|tag=gliders&page=1
https://gliders.ioos.us/map/
The customer was interested in visualizing the data in a dynamic way. He wanted to incorporate the movement of the glider into the visualization.
The client has very good experience in data visualization and data science, so beyond solving problems, he helped me discover good and practical methods for data visualization.
%matplotlib inline
import requests
import os
import pandas as pd
import numpy as np
import matplotlib.pylab as plt
Consider mission #105:
The data is downloadable as a .csv file.
data_url = "http://ceotr.ocean.dal.ca/erddap/tabledap/Nemesis_20190902_105_realtime.csvp?time%2Cdepth%2Clatitude%2Clongitude%2Cconductivity%2Ctemperature%2Csalinity%2Cdensity%2Cpressure%2Cprofile_id"
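data_filename = 'mission105.csv'
# Download the file once; later runs reuse the local copy.
if not os.path.exists(data_filename):
    with open(data_filename, 'wb') as f:
        f.write(requests.get(data_url).content)
df = pd.read_csv(data_filename, parse_dates=True, index_col=0)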
df.head()
Clean up column names by stripping units from names.
newnames = { column: column.split(' ')[0] for column in df.columns }
df = df.rename(columns=newnames)
Either depth or pressure is recorded, but never both at the same time. Split the data into two dataframes: one with depths and the other for temperature / salinity / pressure.
position = df[['depth', 'latitude', 'longitude']].dropna()
profile = df.drop(columns=['depth']).dropna()
profile_subset = profile[(profile.index > '2019-09-07') & (profile.index < '2019-09-14')]
position_subset = position[(position.index > '2019-09-07') & (position.index < '2019-09-14')]
profile_r = profile.resample('H').mean()
position.describe()
profile.describe()
#plt.close(f)
from pandas.plotting import register_matplotlib_converters
register_matplotlib_converters()
font_size = 20
ylabel = 'pressure'
y = profile_subset.pressure
xlabel2 = 'density'
x2 = profile_subset.density
xlabel3 = 'Salinity'
x3 = profile_subset.salinity
xlabel4 = 'Temperature (C)'
x4 = profile_subset.temperature
xlabel5 = 'conductivity'
x5 = profile_subset.conductivity
# Four-panel plot
fig3, (ax) = plt.subplots(2,2,sharey=True,figsize=(40, 20))
# ===========
ax2 = ax[0,0]
ax3 = ax[0,1]
ax4 = ax[1,0]
ax5 = ax[1,1]
pcm = ax2.scatter(profile_subset.index, y, c=x2)
ax2.set_ylim(200, 0)
ax2.set_xlim('2019-09-09', '2019-09-14')
ax2.set_title('Density by Pressure (Depth)', fontsize=font_size)
ax2.set_xlabel("Date", fontsize=font_size)
ax2.set_ylabel(ylabel, fontsize=font_size)
fig3.colorbar(pcm, ax=ax2)
# ===========
pcm = ax3.scatter(profile_subset.index, y, c=x3)
ax3.set_ylim(200, 0)
ax3.set_xlim('2019-09-07', '2019-09-14')
ax3.set_title('Salinity by Pressure (Depth)', fontsize=font_size)
ax3.set_xlabel("Date", fontsize=font_size)
fig3.colorbar(pcm, ax=ax3)
# ===========
pcm = ax4.scatter(profile_subset.index, y, c=x4)
ax4.set_ylim(200, 0)
ax4.set_xlim('2019-09-09', '2019-09-14')
ax4.set_title('Temperature by Pressure (Depth)', fontsize=font_size)
ax4.set_xlabel("Date", fontsize=font_size)
fig3.colorbar(pcm, ax=ax4)
# ===========
pcm = ax5.scatter(profile_subset.index, y, c=x5)
ax5.set_ylim(200, 0)
ax5.set_xlim('2019-09-07', '2019-09-14')
ax5.set_title('Conductivity by Pressure (Depth)', fontsize=font_size)
ax5.set_xlabel("Date", fontsize=font_size)
fig3.colorbar(pcm, ax=ax5)
import matplotlib.animation as animation
from datetime import datetime, timedelta
from pandas.plotting import register_matplotlib_converters
register_matplotlib_converters()
import pandas as pd
import numpy as np
font_size = 20
ylabel = 'pressure'
y = profile_subset.pressure
xlabel2 = 'density'
x2 = profile_subset.density
xlabel3 = 'Salinity'
x3 = profile_subset.salinity
xlabel4 = 'Temperature (C)'
x4 = profile_subset.temperature
xlabel5 = 'conductivity'
x5 = profile_subset.conductivity
original_date = '2019-09-07'
dt = datetime.strptime(original_date, '%Y-%m-%d').date()
# Four-panel plot
fig3, ax = plt.subplots(2,2,figsize=(40, 20), gridspec_kw={'hspace': 0.2})
# (axes, column used for colour, panel title) for each of the four facets
panels = [
    (ax[0, 0], 'density', 'Density by Pressure (Depth)'),
    (ax[0, 1], 'salinity', 'Salinity by Pressure (Depth)'),
    (ax[1, 0], 'temperature', 'Temperature by Pressure (Depth)'),
    (ax[1, 1], 'conductivity', 'Conductivity by Pressure (Depth)'),
]

def updateax1(angle):
    # Frame number `angle` shifts a 7-day window forward by that many days.
    dmin = dt + timedelta(days=angle)
    dmax = dt + timedelta(days=angle + 7)
    profile_subset = profile[(profile.index > str(dmin)) & (profile.index < str(dmax))]
    artists = []
    for axis, column, title in panels:
        cs = axis.scatter(profile_subset.index, profile_subset.pressure, c=profile_subset[column])
        axis.set_ylim(220, 0)  # surface at the top
        axis.set_xlim(dmin, dmax)
        axis.tick_params(axis="x", labelsize=font_size)
        axis.tick_params(axis="y", labelsize=font_size)
        axis.set_title(title, fontsize=font_size)
        axis.set_xlabel("Date", fontsize=font_size)
        axis.set_ylabel("Pressure (Depth)", fontsize=font_size)
        artists.append(cs)
    print(angle)
    return artists
days = 30
ani1=animation.FuncAnimation(fig3, updateax1, frames=days, interval=1000, blit=True)
ani1.save('scatter.mp4')
The scatter plots present Density, Salinity, Temperature, and Conductivity at different depths and on different dates.
Idiom: Animated scatterplots
What? The data is in tabular format.
Scatter plot (both static and animated):
2D scatter plot: The date, which is an ordered attribute, is used as the shared axis. Different attributes (Temperature, Density, Conductivity, and Salinity) are illustrated in 2D space. All of the y-axis attributes are quantitative. Each facet combines one quantitative and one ordered attribute.
Why?
Find trends, distribution
Action: Analyze, Present
Target: Present the distribution
Scale: Items: hundreds
How:
Encode: The color range runs from blue to yellow. Blue represents low values and yellow represents high values.
Scatterplots are highly effective for the abstract task of providing overviews and characterizing distributions.
Mark: The mark is necessarily a point. Each point mark represents a Temperature, Density, Conductivity, or Salinity value.
Channel: The color channel is used to encode the value of each item.
Facet: Helps to show all the information side by side.
Manipulate: Animated transition
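The encoding described above (point marks, time on the x-axis, pressure on the y-axis, color for the measured value) can be sketched on synthetic data. Everything in this block is an assumption for illustration: the values are randomly generated, and `viridis` is named explicitly only because it is matplotlib's default colormap, running from dark blue-purple to yellow.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Synthetic stand-in for one week of profile data (values are made up).
rng = np.random.default_rng(0)
times = pd.date_range("2019-09-07", periods=200, freq="h")
pressure = rng.uniform(0, 200, size=200)
temperature = 8 - 0.03 * pressure + rng.normal(0, 0.2, size=200)

fig, ax = plt.subplots(figsize=(8, 4))
# Point mark; x = time (ordered), y = pressure, colour = temperature.
points = ax.scatter(times, pressure, c=temperature, cmap="viridis")
ax.invert_yaxis()  # put the surface at the top of the panel
ax.set_xlabel("Date")
ax.set_ylabel("Pressure (dbar)")
colorbar = fig.colorbar(points, ax=ax)
colorbar.set_label("Temperature (°C)")
```

Labeling the colorbar with the variable and its unit makes the color channel readable without referring back to the panel title.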
import matplotlib.animation as animation
from mpl_toolkits.mplot3d import Axes3D
font_size = 10
fig = plt.figure(figsize=(20,15))
step = 20
# (column used for colour, panel title) for each of the four 3D facets
panels = [
    ('conductivity', 'Conductivity by Pressure (Depth)'),
    ('temperature', 'Temperature by Pressure (Depth)'),
    ('salinity', 'Salinity by Pressure (Depth)'),
    ('density', 'Density by Pressure (Depth)'),
]

# Build the four 3D scatter plots once; the animation only rotates the camera.
axes3d = []
scatters = []
for i, (column, title) in enumerate(panels, start=1):
    ax = fig.add_subplot(2, 2, i, projection='3d')
    cs = ax.scatter3D(profile.latitude, profile.longitude, profile.pressure,
                      c=profile[column], cmap='hsv')
    ax.set_title(title, fontsize=font_size)
    axes3d.append(ax)
    scatters.append(cs)

def updateax1(angle):
    # Rotate every panel to the same azimuth (elevation fixed at 45 degrees).
    for ax in axes3d:
        ax.view_init(45, angle * step)
    print(angle)
    return scatters
ani1=animation.FuncAnimation(fig, updateax1, frames=30, interval=500, blit=True)
ani1.save('scatter3D.mp4')
A position is spatial data in three dimensions. Latitude, longitude, and depth (which can also be represented by pressure) are used as the three axes. Different attributes (Temperature, Density, Conductivity, and Salinity) are illustrated in 3D space at the latitude, longitude, and depth of each point where the glider passed and measured data. Every position value is shown, which is useful for illustrating the distribution of the data.
Idiom: Animated 3D scatter plot
What: The data is in tabular format.
Each facet combines one quantitative attribute and three spatial values that give the position.
Why? Task
Action: Present
Target: Find the distribution
Scatterplots are highly effective for the abstract task of providing overviews and characterizing distributions.
Scale: Items: hundreds
How:
Encode: Three orthogonal spatial directions. Color encodes the attribute value through the `hsv` colormap set in the code.
Mark: The mark is necessarily a point. Each point mark represents a Temperature, Density, Conductivity, or Salinity value. Horizontal and vertical spatial position encode the primary quantitative attributes.
Channel: The channel is color difference.
Facet: Helps to show all the information side by side.
Manipulate: Animated transition
%matplotlib notebook
df_ = profile[(profile.index > '2019-09-11') & (profile.index < '2019-09-20')]
# Resample to hourly means to reduce redundancy in the slowly changing data.
df_ = df_.resample('1h').mean()
width_in_inches = 10
height_in_inches = 10
dots_per_inch = 70
f = plt.figure(
    figsize=(width_in_inches, height_in_inches),
    dpi=dots_per_inch)
# One stacked panel per variable, all plotted against time
plt.subplot(3, 1, 1, title='Temperature-Date')
df_['temperature'].plot()
plt.subplot(3, 1, 2, title='Salinity-Date')
df_['salinity'].plot()
plt.subplot(3, 1, 3, title='Density-Date')
df_['density'].plot()
f.savefig('test.png')
Idiom: Interactive line chart
On one side we have quantitative attributes (Temperature, Salinity, Density), and on the other side we have an ordered time attribute. A line chart is a great way to illustrate peaks, dips, and temporal or seasonal trends.
What: The data is in tabular format.
Data: One shared ordered attribute (time) and three quantitative attributes, one per facet.
Tasks: look up values, find trends, and present the distribution of values such as Temperature, Salinity, and Density on different days of the year.
Why:
Action: Look up, show trends
Target: Dependency
How:
Encode: This figure is a line chart combined with systematic zooming and filtering, in which a particular time range is selected for display. The user can easily navigate over time. The vertical axes are Temperature, Salinity, or Density, which change over time, and the shared x-axis is time.
Ocean data do not change rapidly over time, so to avoid redundancy the data are resampled to 1 hour.
Direct comparison between curve heights at all times is easy. Date and time are shown on the x-axis. This is a very traditional and successful way of showing temporal patterns.
Encode: Dot chart with connections between the dots.
Items are filtered and reduced by resampling the data to 1 hour.
Channels: Blue lines that show the different trends.
Marks: Line connections between dots.
Scale: Hundreds of levels
Manipulate: Navigate with pan, systematic zooming
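The item reduction by resampling can be sketched as follows. The one-minute synthetic series is an assumption for illustration; the real glider sampling interval is not stated here.

```python
import numpy as np
import pandas as pd

# Synthetic one-minute temperature readings over six hours (made-up values).
index = pd.date_range("2019-09-11", periods=360, freq="min")
toy = pd.DataFrame({"temperature": np.linspace(5.0, 8.0, 360)}, index=index)

# Average all readings that fall in the same hour.
hourly = toy.resample("1h").mean()
print(len(toy), "->", len(hourly))
```

Each hourly value is the mean of the readings in that hour, so slow trends are preserved while the number of plotted items drops sharply.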
# To close the Navigation Window
plt.close(f)
import gmplot

# Center the map on the deployment region (latitude, longitude, zoom level)
gmap3 = gmplot.GoogleMapPlotter(57.3164945, -66.000, 7)
# Scatter the glider positions on the Google map
gmap3.scatter(position.latitude, position.longitude, '#FF0000',
              size=40, marker=False)
# Draw a line between consecutive positions
gmap3.plot(position.latitude, position.longitude,
           'orange', edge_width=2.5)
gmap3.draw("Desktop\\map3.html")
# read data
%matplotlib inline
import csv
import numpy as np
import matplotlib.pyplot as plt
import plotly.express as px
import requests
import os
import pandas as pd
import matplotlib.pylab as plt
from datetime import datetime
import matplotlib.dates as mdates
from matplotlib.pyplot import *
from mpl_toolkits.axes_grid1 import host_subplot
import matplotlib.animation as animation
from IPython.display import HTML
#your path should be:
#data_path = 'data/Nemesis_20190902_105_realtime_3977_cd22_d42e.csv'
data_path = 'mission105.csv'
with open(data_path, 'r') as f:
    reader = csv.reader(f, delimiter=',')
    # get header from first row
    headers = next(reader)
    # get all the rows as a list
    data = list(reader)
# transform data into numpy array
#data = np.array(data).astype(float)
data=np.array(data)
#data.dropna().describe()
dataDate = np.array(data[:,0]).astype(datetime)
dataDepth = np.array(data[:,1]).astype(float)
dataGeo = np.array(data[:,2:4]).astype(float)
dataTemp = np.array(data[:,5]).astype(float)
datasalinity= np.array(data[:,6]).astype(float)
dataDensity = np.array(data[:,7]).astype(float)
dataPress = np.array(data[:,8]).astype(float)
print(headers)
print(dataGeo.shape)
#print(data[:30])
#plt.plot(dataGeo[:, 0], dataGeo[:, 1])
#plt.axis('equal')
#plt.xlabel(headers[3])
#plt.ylabel(headers[2])
#plt.show()
ReadingSpeed=800
x1 = dataGeo[:,1]
y1 =dataGeo[:,0]
x2=np.arange(0,x1.shape[0],1)
y2=dataTemp
x3=np.arange(0,x1.shape[0],1)
# Salinity and density, matching the axis labels set later (headers[6], headers[7])
y3=datasalinity
x4=np.arange(0,x1.shape[0],1)
y4=dataDensity
print(x1.shape)
#f0 = figure(num = 0, figsize = (12, 8))#, dpi = 100)
#f0.suptitle(, fontsize=12)
#ax1 = subplot2grid((2, 2), (0, 0))
#ax2 = subplot2grid((2, 2), (0, 1))
#ax3 = subplot2grid((2, 2), (1, 0))
fig = plt.figure(figsize = (12, 8))
ax1 = fig.add_subplot(221)
ax2 = fig.add_subplot(222)
ax3 = fig.add_subplot(224)
ax4 = fig.add_subplot(223)
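# Set the axis limits of the track panel (NaN-safe minimum and maximum)
ax1.set_xlim(np.nanmin(x1), np.nanmax(x1))
ax1.set_ylim(np.nanmin(y1), np.nanmax(y1))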
line1, = ax1.plot(x1, y1, color = "r")
line2, = ax2.plot(x2, y2, color = "b")
line3, = ax3.plot(x3, y3, color = "g")
line4, = ax4.plot(x4, y4, color = "r")
ax1.grid(True)
#img = plt.imread("ocian.jpg")
#ax1.set_xlim(0,max(x1))
#ax1.set_ylim(0,max(y1))
img = plt.imread("ocian.jpg")
#ax1.imshow(img)
#ax1.imshow(img,extent=[0, max(x1), 0, max(y1)], animated=True)
def update(num, x1, y1, line1, x2, y2, line2, x3, y3, line3, x4, y4, line4):
    # Reveal the first num*ReadingSpeed samples of every line
    line1.set_data(x1[:num*ReadingSpeed], y1[:num*ReadingSpeed])
    line2.set_data(x2[:num*ReadingSpeed], y2[:num*ReadingSpeed])
    line3.set_data(x3[:num*ReadingSpeed], y3[:num*ReadingSpeed])
    line4.set_data(x4[:num*ReadingSpeed], y4[:num*ReadingSpeed])
    return [line1, line2, line3, line4]
ani = animation.FuncAnimation(fig, update, (len(x1)//ReadingSpeed),fargs=[x1,y1,line1, x2,y2,line2, x3,y3,line3,x4,y4,line4],interval=20, blit=True)
ax1.set_xlabel(headers[3])
ax1.set_ylabel(headers[2])
ax2.set_ylabel(headers[5])
ax3.set_ylabel(headers[6])
ax4.set_ylabel(headers[7])
#plt.show()
# Draw the background image behind the track, then show the animation in Jupyter
ax1.imshow(img, zorder=0, extent=[np.nanmin(x1), np.nanmax(x1), np.nanmin(y1), np.nanmax(y1)])
HTML(ani.to_html5_video())
# Uncomment the next line if you want to save the animation
#ani.save(filename='sim.mp4',fps=30,dpi=300)
Conclusion, animation chart: A four-facet diagram is a good option for summarizing the data because it can include the spatial position of the data together with the related Temperature, Salinity, and Density at each point.
Idiom: Dense Software Overviews
What? The dataset is in tabular format, consisting of rows and columns. Temperature, Salinity, and Density are quantitative attributes. Latitude and Longitude are spatial data types.
Why?
Action: Present, Locate
Target:
Summarize and present
How?
The last chart is shown as a sequence of many frames, where the viewer can control playback by playing, pausing, or stopping. Animation is used instead of a truly interactive form because we are interested in presenting the route navigated by the glider and the fields it measured at each point and time along that route.
With animation, the user does not directly control what occurs, only the speed at which the animation is played.
Giving people the ability to pause and replay the animation is much better than letting them see it only once.
Facet: Shows multiple views side by side.
Encode: Charts that show trends and summarize the data.
Manipulate: Animated transition.
Channel: The attribute is motion.
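The frame-by-frame reveal used in the animation can be sketched on a synthetic track. The circular path, the name `points_per_frame` (a hypothetical counterpart of `ReadingSpeed`), and the axis limits are assumptions for illustration, not the glider data.

```python
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.animation as animation

# Synthetic track (made-up values) standing in for the glider path.
t = np.linspace(0, 2 * np.pi, 400)
x, y = np.cos(t), np.sin(t)
points_per_frame = 40  # hypothetical counterpart of ReadingSpeed

fig, ax = plt.subplots()
ax.set_xlim(-1.1, 1.1)
ax.set_ylim(-1.1, 1.1)
line, = ax.plot([], [], color="r")

def update(frame):
    # Reveal the first frame * points_per_frame samples of the track.
    n = frame * points_per_frame
    line.set_data(x[:n], y[:n])
    return [line]

frames = len(x) // points_per_frame
ani = animation.FuncAnimation(fig, update, frames=frames, interval=20, blit=True)
```

A larger `points_per_frame` gives fewer frames and a faster render, at the cost of a coarser impression of the glider's movement.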
Thus, by applying Munzner's method for visualization we can easily analyze the information we have and the information we can derive, and prepare a proper visualization that is simple yet informative. These visual tools can be really helpful for formulating plans, making decisions, and conveying proper instructions. Munzner has discussed a number of really useful visual tools and has provided the essence of these tools in terms of three questions (What, Why, and How). The How part of the analysis directs us towards choosing proper idioms, channels, and marks for our visualization according to the type of attributes. The actions identified in the Why part of this analysis helped us to reach our goals. Her explanations were really in-depth and persuasive. But visualization has a few limitations too. Some of the limitations can be listed as: